Author : Indumathi Pandiyan

Computer Vision Project submitted for PGP-AIML Great Learning on 01-May-2022

PART A

DOMAIN: Botanical Research
CONTEXT: University X is conducting research to understand the characteristics of plants and plant seedlings at various stages of growth. They have already invested in curating sample images. They require an automated solution that can build a classifier capable of determining a plant's species from a photo.

DATA DESCRIPTION: The dataset comprises images from 12 plant species. Source: https://www.kaggle.com/c/plant-seedlings-classification/data

• PROJECT OBJECTIVE: To create a classifier capable of determining a plant's species from a photo.

Steps and tasks: [ Total Score: 30 Marks]

1. Import and Understand the data [12 Marks]

Import required libraries

A. Extract ‘plant-seedlings-classification.zip’ into new folder (unzipped) using python. [2 Marks]

Hint: You can extract it manually, but doing so loses 2 marks.

Comments:

The unzipped folder was created in the file structure and all the files were extracted into it.
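The extraction step can be sketched with the standard-library `zipfile` module. This is a minimal sketch, not necessarily the code used in the notebook; the helper name `extract_archive` is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir="unzipped"):
    """Extract every member of zip_path into dest_dir and return the file names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if missing
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # namelist() also lists directory entries; keep only real files
        return [name for name in archive.namelist() if not name.endswith("/")]
```

Calling `extract_archive("plant-seedlings-classification.zip", "unzipped")` would then populate the unzipped folder, assuming the archive sits in the working directory.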

B. Map the images from train folder with train labels to form a DataFrame. [6 Marks]

Comments

Successfully created a DataFrame containing the actual image, the name of the image (image_name), the image path and the corresponding species name (species_name).
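One way to collect the rows for such a DataFrame is to walk the train folder, assuming it holds one sub-folder per species with the images inside. The sketch below gathers the name, path and species of each image; the helper name `build_image_records` and the list of extensions are assumptions, and loading the pixel data itself is left out.

```python
from pathlib import Path


def build_image_records(train_dir, extensions=(".png", ".jpg", ".jpeg")):
    """Return one record per image: image_name, image_path, species_name."""
    records = []
    for species_dir in sorted(Path(train_dir).iterdir()):
        if not species_dir.is_dir():
            continue  # skip stray files next to the species folders
        for image_path in sorted(species_dir.iterdir()):
            if image_path.suffix.lower() in extensions:
                records.append({
                    "image_name": image_path.name,
                    "image_path": str(image_path),
                    "species_name": species_dir.name,  # folder name is the label
                })
    return records
```

Passing the returned list to `pandas.DataFrame(records)` yields one row per image with these three columns.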

Printing a single image

C. Write a function that will select n random images and display images along with its species. [4 Marks]

Hint: If the input to the function is 5, it should print 5 random images along with their labels.

Method to print n random images from the data set

This method accepts the number of random images to show and prints them by retrieving the images stored in the dataset.

Method invocation by specifying the number of images as 5
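The selection part of such a method can be sketched as follows. It assumes the data is available as a list of records (for example, dictionaries with `image_name` and `species_name` keys); the name `pick_random_images` and the `seed` parameter are hypothetical.

```python
import random


def pick_random_images(records, n, seed=None):
    """Return n distinct records chosen at random (no image is repeated)."""
    if n > len(records):
        raise ValueError(f"asked for {n} images but only {len(records)} exist")
    rng = random.Random(seed)  # a local generator keeps the global state untouched
    return rng.sample(records, n)
```

Each picked record can then be displayed, for instance with matplotlib's `imshow`, using its species name as the title.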

2. Data preprocessing [8 Marks]

A. Create X & Y from the DataFrame. [2 Marks]

B. Encode labels of the images. [2 Marks]
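Label encoding maps each species name to an integer id. A minimal NumPy sketch is shown here as an assumption about how this could be done; it sorts the class names alphabetically, which is also how scikit-learn's `LabelEncoder` orders its classes. The helper name `encode_labels` is hypothetical.

```python
import numpy as np


def encode_labels(species_names):
    """Map species names to integer ids; returns (ids, classes)."""
    # np.unique sorts the distinct names; return_inverse gives each row's index
    classes, ids = np.unique(np.asarray(species_names), return_inverse=True)
    return ids, classes
```

Indexing `classes[ids]` recovers the original names, which is useful later when turning a predicted id back into a species name.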

C. Unify shape of all the images. [2 Marks]

Method to Unify the image shapes
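The idea behind unifying shapes is to resize every image to one common height and width so they can be stacked into a single array. The sketch below uses plain NumPy nearest-neighbour sampling as a stand-in for a library resize such as OpenCV's `cv2.resize`; the target size of 128 x 128 and the helper names are assumptions, not values taken from the notebook.

```python
import numpy as np


def resize_nearest(image, size=(128, 128)):
    """Resize an H x W (x C) array to size using nearest-neighbour sampling."""
    height, width = image.shape[:2]
    new_h, new_w = size
    # for each output row/column, pick the source row/column it maps onto
    rows = np.arange(new_h) * height // new_h
    cols = np.arange(new_w) * width // new_w
    return image[rows[:, None], cols]


def unify_shapes(images, size=(128, 128)):
    """Resize every image to the same size and stack them into one array."""
    return np.stack([resize_nearest(img, size) for img in images])
```

Nearest-neighbour sampling is the simplest choice; an interpolating resize generally gives smoother results when shrinking large images.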

D. Normalise all the images. [2 Marks]

Comments:

All the images are now normalised and ready for model building.
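A common way to normalise images, assuming they are stored as 8-bit arrays with values from 0 to 255, is to scale them into the range 0 to 1. The helper name `normalise_images` is hypothetical.

```python
import numpy as np


def normalise_images(images):
    """Scale 8-bit pixel values from [0, 255] to floats in [0, 1]."""
    return np.asarray(images, dtype=np.float32) / 255.0
```

Keeping the result as `float32` halves the memory needed compared with NumPy's default `float64`.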

3. Model Training [10 Marks]

A. Split the data into train and test data. [2 Marks]

B. Create new CNN architecture to train the model. [4 Marks]
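To illustrate what a single building block of a CNN computes, the sketch below runs a valid convolution, a ReLU and a 2 x 2 max pooling on one image in plain NumPy. It is not the architecture trained in this project; the function name `conv_block` and the kernel layout are assumptions chosen for illustration.

```python
import numpy as np


def conv_block(image, kernels):
    """Valid convolution -> ReLU -> 2x2 max pooling on one H x W x C image."""
    k = kernels.shape[0]  # kernels has shape (k, k, C, n_filters)
    h, w = image.shape[0] - k + 1, image.shape[1] - k + 1
    out = np.zeros((h, w, kernels.shape[3]))
    for dy in range(k):
        for dx in range(k):
            patch = image[dy:dy + h, dx:dx + w, :]  # shifted view, (h, w, C)
            out += patch @ kernels[dy, dx]          # mix channels into filters
    out = np.maximum(out, 0.0)                      # ReLU
    h2, w2 = h // 2, w // 2                         # drop an odd last row/column
    pooled = out[:h2 * 2, :w2 * 2].reshape(h2, 2, w2, 2, -1)
    return pooled.max(axis=(1, 3))
```

An 8 x 8 x 3 input with four 3 x 3 filters produces a 3 x 3 x 4 output: the convolution shrinks each side to 6 and the pooling halves it.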

C. Train the model on train data and validate on test data. [2 Marks]

Comments:

The graph shows that both training and test accuracy improve as training progresses. Training was run for 25 epochs; the training accuracy reached about 91.55%, whereas the test accuracy reached about 80%.

Comments

According to the confusion matrix, the model predicts well overall. Misclassifications occur for class 6.
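A confusion matrix and the per-class recall derived from it make it easy to spot a weak class. The following is a NumPy sketch under the assumption that labels are integer class ids; the function names are hypothetical and the sample labels in the usage note are made up.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are actual classes, columns are predicted classes."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when (true, pred) pairs repeat
    np.add.at(matrix, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return matrix


def per_class_recall(matrix):
    """Fraction of each actual class that was predicted correctly."""
    totals = matrix.sum(axis=1)
    return np.divide(np.diag(matrix), totals,
                     out=np.zeros(len(matrix)), where=totals > 0)
```

For a network that outputs class probabilities, the predicted ids would typically come from `probabilities.argmax(axis=1)`. A class with low recall is one that the model often mistakes for another class.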

D. Select a random image and print actual label and predicted label for the same. [2 Marks]

Getting a Random Number and printing its image and corresponding label

Method to predict the image

Loading the Same image for Prediction

Verifying that the predicted label matches the actual label

Conclusion:

PART B

DOMAIN: Botanical Research
CONTEXT: University X is conducting research to understand the characteristics of flowers. They have already invested in curating sample images. They require an automated solution that can build a classifier capable of determining a flower’s species from a photo.

DATA DESCRIPTION: The dataset comprises images from 17 plant species.

• PROJECT OBJECTIVE: To experiment with various approaches to train an image classifier to predict the type of flower from the image.

Steps and tasks: [ Total Score: 30 Marks]

1. Import and Understand the data [5 Marks]

A. Import and read oxflower17 dataset from tflearn and split into X and Y while loading. [2 Marks]

Comments:
The data is loaded from the oxflower17 dataset and split into X and y successfully.

B. Print Number of images and shape of the images. [1 Mark]

Observation

The images have a shape of 224*224.

C. Print count of each class from y. [2 Marks]

Unique Class names

Observation

Printing the individual count with collection object

Printing the individual count with Numpy

Comments:

The individual class counts are printed, and a count plot is also shown to display them. Every class from 0 to 16 has a count of 80.
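The two counting approaches mentioned above can be sketched side by side. This assumes `y` holds integer class ids and that the collections object is a `collections.Counter`; the helper name `class_counts` is hypothetical.

```python
from collections import Counter

import numpy as np


def class_counts(y):
    """Return {class_id: count}, computed two ways as a cross-check."""
    y = np.asarray(y)
    with_counter = Counter(y.tolist())
    values, counts = np.unique(y, return_counts=True)
    with_numpy = dict(zip(values.tolist(), counts.tolist()))
    assert dict(with_counter) == with_numpy  # both methods must agree
    return with_numpy
```

With 17 classes of 80 images each, the result has 17 keys and every value equals 80, for a total of 1360 images.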

2. Image Exploration & Transformation [Learning purpose - Not related to final model] [10 Marks]

A. Display 5 random images. [1 Marks]

Comments
The 5 random images are displayed.

Viewing more images for a better understanding of the data

B. Select any image from the dataset and assign it to a variable. [1 Marks]

Comments
An image is stored in a variable

C. Transform the image into grayscale format and display the same. [3 Marks]
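Grayscale conversion is a weighted sum of the colour channels. The sketch below uses the ITU-R BT.601 luminance weights and assumes the channels are in RGB order (images loaded with OpenCV are in BGR order and would need the weights reversed). The helper name `to_grayscale` is hypothetical.

```python
import numpy as np


def to_grayscale(image):
    """Convert an H x W x 3 RGB array to H x W luminance (BT.601 weights)."""
    weights = np.array([0.299, 0.587, 0.114])  # red, green, blue
    return image[..., :3] @ weights
```

The result can be displayed with a gray colour map, for example `plt.imshow(gray, cmap="gray")` in matplotlib.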

D. Apply a filter to sharpen the image and display the image before and after sharpening. [2 Marks]

E. Apply a filter to blur the image and display the image before and after blur. [2 Marks]
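Sharpening and blurring are both 3 x 3 neighbourhood filters that differ only in the kernel. The sketch below applies a kernel to a single-channel image in NumPy; the particular sharpen and box-blur kernels are common textbook choices and are assumptions, not necessarily the filters used in the notebook.

```python
import numpy as np

SHARPEN = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)
BOX_BLUR = np.full((3, 3), 1.0 / 9.0)  # plain average of the 3 x 3 neighbourhood


def apply_kernel(gray, kernel):
    """Filter a 2-D image with a 3 x 3 kernel, replicating edge pixels."""
    padded = np.pad(gray.astype(float), 1, mode="edge")
    out = np.zeros(gray.shape, dtype=float)
    height, width = gray.shape
    for dy in range(3):
        for dx in range(3):
            # shift the padded image and weight it by the kernel entry
            # (both kernels are symmetric, so flipping them is unnecessary)
            out += kernel[dy, dx] * padded[dy:dy + height, dx:dx + width]
    return out
```

Both kernels sum to 1, so flat regions keep their brightness. Sharpening can push values outside the original range, so clipping with `np.clip` before display is usually needed. For a colour image the filter can be applied to each channel in turn.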

F. Display all the 4 images from above questions beside each other to observe the difference. [1 Mark]

Observation:

3. Model training and Tuning: [15 Marks]

A. Split the data into train and test with 80:20 proportion. [2 Marks]
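An 80:20 split that keeps the class proportions equal in both parts can be sketched in NumPy as below. This is an illustration of a stratified split rather than the notebook's own code; the helper name `stratified_split` and the seed are assumptions. The return order mirrors scikit-learn's `train_test_split`.

```python
import numpy as np


def stratified_split(X, y, test_fraction=0.2, seed=42):
    """Split X, y so each class keeps roughly the same share in both parts."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    test_idx = []
    for label in np.unique(y):
        # shuffle the positions of this class, then reserve a share for test
        members = rng.permutation(np.flatnonzero(y == label))
        n_test = max(1, int(round(len(members) * test_fraction)))
        test_idx.extend(members[:n_test])
    test_mask = np.zeros(len(y), dtype=bool)
    test_mask[test_idx] = True
    return X[~test_mask], X[test_mask], y[~test_mask], y[test_mask]
```

With scikit-learn available, `train_test_split(X, y, test_size=0.2, stratify=y)` achieves the same effect in one call.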

B. Train a model using any Supervised Learning algorithm and share performance metrics on test data. [3 Marks]

Classical supervised learning models need 2-dimensional input (one row of features per image)

The data is already normalised, so it only needs reshaping for the supervised models
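Reshaping for a classical model means flattening each image into one feature row. A minimal sketch, assuming the images are stacked in an array of shape (n, height, width, channels); the helper name `flatten_images` is hypothetical.

```python
import numpy as np


def flatten_images(images):
    """Reshape (n, H, W, C) images into (n, H*W*C) feature rows."""
    images = np.asarray(images)
    return images.reshape(len(images), -1)
```

For 224 x 224 images with 3 colour channels (the channel count is an assumption here), each row has 224 * 224 * 3 = 150528 features, which is why classical models train slowly on raw pixels.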

Logistic Regression for Classification

Classification report for Logistic Regression

KNeighbors classifier

Build SVM model

Comments and Observation:

C. Train a model using Neural Network and share performance metrics on test data. [4 Marks]

Reshape train and test sets into compatible shapes

Image Classification

Convert the labels into a set of 17 classes (0 to 16) to feed to the neural network

As the pixel values are already normalised, no further normalisation is required
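Since the labels run from 0 to 16, the network's output layer needs 17 units and the labels are usually converted to one-hot rows. The NumPy sketch below shows the idea; the helper name `one_hot` is hypothetical, and Keras users would typically call `to_categorical(y, num_classes=17)` instead.

```python
import numpy as np


def one_hot(labels, n_classes=17):
    """Turn integer labels 0..n_classes-1 into one-hot rows."""
    labels = np.asarray(labels, dtype=int)
    # row i of the identity matrix is the one-hot vector for class i
    return np.eye(n_classes, dtype=np.float32)[labels]
```

Taking `argmax(axis=1)` of the one-hot rows, or of the network's predicted probabilities, converts them back to integer labels.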

Create Neural Network Model

Define a method to create ANN model

D. Train a model using a basic CNN and share performance metrics on test data. [4 Marks]

Visualize accuracy

Confusion matrix

E. Predict the class/label of image ‘Prediction.jpg’ using best performing model and share predicted label. [2 Marks]

Our model correctly predicts the flower as label 2.

Conclusion

In this project, we learned to use different approaches to train classifiers on the Oxflower dataset. Among the models tried, the CNN performed best on the test data, achieving a maximum test accuracy of 61.20%, whereas its train accuracy was 98.98%. The large gap between the two indicates that the model overfits the training data.

Advantages of Convolutional Neural network

The CNN model was able to predict the class of the prediction image correctly; this was verified by checking the same image in the dataset against its label.